TITLE by Hana AlMashari

About Data

The dataset dimensions are 113937, 82

Exploring the Data Types of the Vairables:

## 'data.frame':    113937 obs. of  81 variables:
##  $ ListingKey                         : Factor w/ 113066 levels "00003546482094282EF90E5",..: 7180 7193 6647 6669 6686 6689 6699 6706 6687 6687 ...
##  $ ListingNumber                      : int  193129 1209647 81716 658116 909464 1074836 750899 768193 1023355 1023355 ...
##  $ ListingCreationDate                : Factor w/ 113064 levels "2005-11-09 20:44:28.847000000",..: 14184 111894 6429 64760 85967 100310 72556 74019 97834 97834 ...
##  $ CreditGrade                        : Factor w/ 9 levels "","A","AA","B",..: 5 1 8 1 1 1 1 1 1 1 ...
##  $ Term                               : int  36 36 36 36 36 60 36 36 36 36 ...
##  $ LoanStatus                         : Factor w/ 12 levels "Cancelled","Chargedoff",..: 3 4 3 4 4 4 4 4 4 4 ...
##  $ ClosedDate                         : Factor w/ 2803 levels "","2005-11-25 00:00:00",..: 1138 1 1263 1 1 1 1 1 1 1 ...
##  $ BorrowerAPR                        : num  0.165 0.12 0.283 0.125 0.246 ...
##  $ BorrowerRate                       : num  0.158 0.092 0.275 0.0974 0.2085 ...
##  $ LenderYield                        : num  0.138 0.082 0.24 0.0874 0.1985 ...
##  $ EstimatedEffectiveYield            : num  NA 0.0796 NA 0.0849 0.1832 ...
##  $ EstimatedLoss                      : num  NA 0.0249 NA 0.0249 0.0925 ...
##  $ EstimatedReturn                    : num  NA 0.0547 NA 0.06 0.0907 ...
##  $ ProsperRating..numeric.            : int  NA 6 NA 6 3 5 2 4 7 7 ...
##  $ ProsperRating..Alpha.              : Factor w/ 8 levels "","A","AA","B",..: 1 2 1 2 6 4 7 5 3 3 ...
##  $ ProsperScore                       : num  NA 7 NA 9 4 10 2 4 9 11 ...
##  $ ListingCategory..numeric.          : int  0 2 0 16 2 1 1 2 7 7 ...
##  $ BorrowerState                      : Factor w/ 52 levels "","AK","AL","AR",..: 7 7 12 12 25 34 18 6 16 16 ...
##  $ Occupation                         : Factor w/ 68 levels "","Accountant/CPA",..: 37 43 37 52 21 43 50 29 24 24 ...
##  $ EmploymentStatus                   : Factor w/ 9 levels "","Employed",..: 9 2 4 2 2 2 2 2 2 2 ...
##  $ EmploymentStatusDuration           : int  2 44 NA 113 44 82 172 103 269 269 ...
##  $ IsBorrowerHomeowner                : Factor w/ 2 levels "False","True": 2 1 1 2 2 2 1 1 2 2 ...
##  $ CurrentlyInGroup                   : Factor w/ 2 levels "False","True": 2 1 2 1 1 1 1 1 1 1 ...
##  $ GroupKey                           : Factor w/ 707 levels "","00343376901312423168731",..: 1 1 335 1 1 1 1 1 1 1 ...
##  $ DateCreditPulled                   : Factor w/ 112992 levels "2005-11-09 00:30:04.487000000",..: 14347 111883 6446 64724 85857 100382 72500 73937 97888 97888 ...
##  $ CreditScoreRangeLower              : int  640 680 480 800 680 740 680 700 820 820 ...
##  $ CreditScoreRangeUpper              : int  659 699 499 819 699 759 699 719 839 839 ...
##  $ FirstRecordedCreditLine            : Factor w/ 11586 levels "","1947-08-24 00:00:00",..: 8639 6617 8927 2247 9498 497 8265 7685 5543 5543 ...
##  $ CurrentCreditLines                 : int  5 14 NA 5 19 21 10 6 17 17 ...
##  $ OpenCreditLines                    : int  4 14 NA 5 19 17 7 6 16 16 ...
##  $ TotalCreditLinespast7years         : int  12 29 3 29 49 49 20 10 32 32 ...
##  $ OpenRevolvingAccounts              : int  1 13 0 7 6 13 6 5 12 12 ...
##  $ OpenRevolvingMonthlyPayment        : num  24 389 0 115 220 1410 214 101 219 219 ...
##  $ InquiriesLast6Months               : int  3 3 0 0 1 0 0 3 1 1 ...
##  $ TotalInquiries                     : num  3 5 1 1 9 2 0 16 6 6 ...
##  $ CurrentDelinquencies               : int  2 0 1 4 0 0 0 0 0 0 ...
##  $ AmountDelinquent                   : num  472 0 NA 10056 0 ...
##  $ DelinquenciesLast7Years            : int  4 0 0 14 0 0 0 0 0 0 ...
##  $ PublicRecordsLast10Years           : int  0 1 0 0 0 0 0 1 0 0 ...
##  $ PublicRecordsLast12Months          : int  0 0 NA 0 0 0 0 0 0 0 ...
##  $ RevolvingCreditBalance             : num  0 3989 NA 1444 6193 ...
##  $ BankcardUtilization                : num  0 0.21 NA 0.04 0.81 0.39 0.72 0.13 0.11 0.11 ...
##  $ AvailableBankcardCredit            : num  1500 10266 NA 30754 695 ...
##  $ TotalTrades                        : num  11 29 NA 26 39 47 16 10 29 29 ...
##  $ TradesNeverDelinquent..percentage. : num  0.81 1 NA 0.76 0.95 1 0.68 0.8 1 1 ...
##  $ TradesOpenedLast6Months            : num  0 2 NA 0 2 0 0 0 1 1 ...
##  $ DebtToIncomeRatio                  : num  0.17 0.18 0.06 0.15 0.26 0.36 0.27 0.24 0.25 0.25 ...
##  $ IncomeRange                        : Factor w/ 8 levels "$0","$1-24,999",..: 4 5 7 4 3 3 4 4 4 4 ...
##  $ IncomeVerifiable                   : Factor w/ 2 levels "False","True": 2 2 2 2 2 2 2 2 2 2 ...
##  $ StatedMonthlyIncome                : num  3083 6125 2083 2875 9583 ...
##  $ LoanKey                            : Factor w/ 113066 levels "00003683605746079487FF7",..: 100337 69837 46303 70776 71387 86505 91250 5425 908 908 ...
##  $ TotalProsperLoans                  : int  NA NA NA NA 1 NA NA NA NA NA ...
##  $ TotalProsperPaymentsBilled         : int  NA NA NA NA 11 NA NA NA NA NA ...
##  $ OnTimeProsperPayments              : int  NA NA NA NA 11 NA NA NA NA NA ...
##  $ ProsperPaymentsLessThanOneMonthLate: int  NA NA NA NA 0 NA NA NA NA NA ...
##  $ ProsperPaymentsOneMonthPlusLate    : int  NA NA NA NA 0 NA NA NA NA NA ...
##  $ ProsperPrincipalBorrowed           : num  NA NA NA NA 11000 NA NA NA NA NA ...
##  $ ProsperPrincipalOutstanding        : num  NA NA NA NA 9948 ...
##  $ ScorexChangeAtTimeOfListing        : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ LoanCurrentDaysDelinquent          : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ LoanFirstDefaultedCycleNumber      : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ LoanMonthsSinceOrigination         : int  78 0 86 16 6 3 11 10 3 3 ...
##  $ LoanNumber                         : int  19141 134815 6466 77296 102670 123257 88353 90051 121268 121268 ...
##  $ LoanOriginalAmount                 : int  9425 10000 3001 10000 15000 15000 3000 10000 10000 10000 ...
##  $ LoanOriginationDate                : Factor w/ 1873 levels "2005-11-15 00:00:00",..: 426 1866 260 1535 1757 1821 1649 1666 1813 1813 ...
##  $ LoanOriginationQuarter             : Factor w/ 33 levels "Q1 2006","Q1 2007",..: 18 8 2 32 24 33 16 16 33 33 ...
##  $ MemberKey                          : Factor w/ 90831 levels "00003397697413387CAF966",..: 11071 10302 33781 54939 19465 48037 60448 40951 26129 26129 ...
##  $ MonthlyLoanPayment                 : num  330 319 123 321 564 ...
##  $ LP_CustomerPayments                : num  11396 0 4187 5143 2820 ...
##  $ LP_CustomerPrincipalPayments       : num  9425 0 3001 4091 1563 ...
##  $ LP_InterestandFees                 : num  1971 0 1186 1052 1257 ...
##  $ LP_ServiceFees                     : num  -133.2 0 -24.2 -108 -60.3 ...
##  $ LP_CollectionFees                  : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ LP_GrossPrincipalLoss              : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ LP_NetPrincipalLoss                : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ LP_NonPrincipalRecoverypayments    : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ PercentFunded                      : num  1 1 1 1 1 1 1 1 1 1 ...
##  $ Recommendations                    : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ InvestmentFromFriendsCount         : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ InvestmentFromFriendsAmount        : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ Investors                          : int  258 1 41 158 20 1 1 1 1 1 ...
##                    ListingKey     ListingNumber    
##  17A93590655669644DB4C06:     6   Min.   :      4  
##  349D3587495831350F0F648:     4   1st Qu.: 400919  
##  47C1359638497431975670B:     4   Median : 600554  
##  8474358854651984137201C:     4   Mean   : 627886  
##  DE8535960513435199406CE:     4   3rd Qu.: 892634  
##  04C13599434217079754AEE:     3   Max.   :1255725  
##  (Other)                :113912                    
##                     ListingCreationDate  CreditGrade         Term      
##  2013-10-02 17:20:16.550000000:     6          :84984   Min.   :12.00  
##  2013-08-28 20:31:41.107000000:     4   C      : 5649   1st Qu.:36.00  
##  2013-09-08 09:27:44.853000000:     4   D      : 5153   Median :36.00  
##  2013-12-06 05:43:13.830000000:     4   B      : 4389   Mean   :40.83  
##  2013-12-06 11:44:58.283000000:     4   AA     : 3509   3rd Qu.:36.00  
##  2013-08-21 07:25:22.360000000:     3   HR     : 3508   Max.   :60.00  
##  (Other)                      :113912   (Other): 6745                  
##                  LoanStatus                  ClosedDate   
##  Current              :56576                      :58848  
##  Completed            :38074   2014-03-04 00:00:00:  105  
##  Chargedoff           :11992   2014-02-19 00:00:00:  100  
##  Defaulted            : 5018   2014-02-11 00:00:00:   92  
##  Past Due (1-15 days) :  806   2012-10-30 00:00:00:   81  
##  Past Due (31-60 days):  363   2013-02-26 00:00:00:   78  
##  (Other)              : 1108   (Other)            :54633  
##   BorrowerAPR       BorrowerRate     LenderYield     
##  Min.   :0.00653   Min.   :0.0000   Min.   :-0.0100  
##  1st Qu.:0.15629   1st Qu.:0.1340   1st Qu.: 0.1242  
##  Median :0.20976   Median :0.1840   Median : 0.1730  
##  Mean   :0.21883   Mean   :0.1928   Mean   : 0.1827  
##  3rd Qu.:0.28381   3rd Qu.:0.2500   3rd Qu.: 0.2400  
##  Max.   :0.51229   Max.   :0.4975   Max.   : 0.4925  
##  NA's   :25                                          
##  EstimatedEffectiveYield EstimatedLoss   EstimatedReturn 
##  Min.   :-0.183          Min.   :0.005   Min.   :-0.183  
##  1st Qu.: 0.116          1st Qu.:0.042   1st Qu.: 0.074  
##  Median : 0.162          Median :0.072   Median : 0.092  
##  Mean   : 0.169          Mean   :0.080   Mean   : 0.096  
##  3rd Qu.: 0.224          3rd Qu.:0.112   3rd Qu.: 0.117  
##  Max.   : 0.320          Max.   :0.366   Max.   : 0.284  
##  NA's   :29084           NA's   :29084   NA's   :29084   
##  ProsperRating..numeric. ProsperRating..Alpha.  ProsperScore  
##  Min.   :1.000                  :29084         Min.   : 1.00  
##  1st Qu.:3.000           C      :18345         1st Qu.: 4.00  
##  Median :4.000           B      :15581         Median : 6.00  
##  Mean   :4.072           A      :14551         Mean   : 5.95  
##  3rd Qu.:5.000           D      :14274         3rd Qu.: 8.00  
##  Max.   :7.000           E      : 9795         Max.   :11.00  
##  NA's   :29084           (Other):12307         NA's   :29084  
##  ListingCategory..numeric. BorrowerState  
##  Min.   : 0.000            CA     :14717  
##  1st Qu.: 1.000            TX     : 6842  
##  Median : 1.000            NY     : 6729  
##  Mean   : 2.774            FL     : 6720  
##  3rd Qu.: 3.000            IL     : 5921  
##  Max.   :20.000                   : 5515  
##                            (Other):67493  
##                     Occupation         EmploymentStatus
##  Other                   :28617   Employed     :67322  
##  Professional            :13628   Full-time    :26355  
##  Computer Programmer     : 4478   Self-employed: 6134  
##  Executive               : 4311   Not available: 5347  
##  Teacher                 : 3759   Other        : 3806  
##  Administrative Assistant: 3688                : 2255  
##  (Other)                 :55456   (Other)      : 2718  
##  EmploymentStatusDuration IsBorrowerHomeowner CurrentlyInGroup
##  Min.   :  0.00           False:56459         False:101218    
##  1st Qu.: 26.00           True :57478         True : 12719    
##  Median : 67.00                                               
##  Mean   : 96.07                                               
##  3rd Qu.:137.00                                               
##  Max.   :755.00                                               
##  NA's   :7625                                                 
##                     GroupKey                 DateCreditPulled 
##                         :100596   2013-12-23 09:38:12:     6  
##  783C3371218786870A73D20:  1140   2013-11-21 09:09:41:     4  
##  3D4D3366260257624AB272D:   916   2013-12-06 05:43:16:     4  
##  6A3B336601725506917317E:   698   2014-01-14 20:17:49:     4  
##  FEF83377364176536637E50:   611   2014-02-09 12:14:41:     4  
##  C9643379247860156A00EC0:   342   2013-09-27 22:04:54:     3  
##  (Other)                :  9634   (Other)            :113912  
##  CreditScoreRangeLower CreditScoreRangeUpper
##  Min.   :  0.0         Min.   : 19.0        
##  1st Qu.:660.0         1st Qu.:679.0        
##  Median :680.0         Median :699.0        
##  Mean   :685.6         Mean   :704.6        
##  3rd Qu.:720.0         3rd Qu.:739.0        
##  Max.   :880.0         Max.   :899.0        
##  NA's   :591           NA's   :591          
##         FirstRecordedCreditLine CurrentCreditLines OpenCreditLines
##                     :   697     Min.   : 0.00      Min.   : 0.00  
##  1993-12-01 00:00:00:   185     1st Qu.: 7.00      1st Qu.: 6.00  
##  1994-11-01 00:00:00:   178     Median :10.00      Median : 9.00  
##  1995-11-01 00:00:00:   168     Mean   :10.32      Mean   : 9.26  
##  1990-04-01 00:00:00:   161     3rd Qu.:13.00      3rd Qu.:12.00  
##  1995-03-01 00:00:00:   159     Max.   :59.00      Max.   :54.00  
##  (Other)            :112389     NA's   :7604       NA's   :7604   
##  TotalCreditLinespast7years OpenRevolvingAccounts
##  Min.   :  2.00             Min.   : 0.00        
##  1st Qu.: 17.00             1st Qu.: 4.00        
##  Median : 25.00             Median : 6.00        
##  Mean   : 26.75             Mean   : 6.97        
##  3rd Qu.: 35.00             3rd Qu.: 9.00        
##  Max.   :136.00             Max.   :51.00        
##  NA's   :697                                     
##  OpenRevolvingMonthlyPayment InquiriesLast6Months TotalInquiries   
##  Min.   :    0.0             Min.   :  0.000      Min.   :  0.000  
##  1st Qu.:  114.0             1st Qu.:  0.000      1st Qu.:  2.000  
##  Median :  271.0             Median :  1.000      Median :  4.000  
##  Mean   :  398.3             Mean   :  1.435      Mean   :  5.584  
##  3rd Qu.:  525.0             3rd Qu.:  2.000      3rd Qu.:  7.000  
##  Max.   :14985.0             Max.   :105.000      Max.   :379.000  
##                              NA's   :697          NA's   :1159     
##  CurrentDelinquencies AmountDelinquent   DelinquenciesLast7Years
##  Min.   : 0.0000      Min.   :     0.0   Min.   : 0.000         
##  1st Qu.: 0.0000      1st Qu.:     0.0   1st Qu.: 0.000         
##  Median : 0.0000      Median :     0.0   Median : 0.000         
##  Mean   : 0.5921      Mean   :   984.5   Mean   : 4.155         
##  3rd Qu.: 0.0000      3rd Qu.:     0.0   3rd Qu.: 3.000         
##  Max.   :83.0000      Max.   :463881.0   Max.   :99.000         
##  NA's   :697          NA's   :7622       NA's   :990            
##  PublicRecordsLast10Years PublicRecordsLast12Months RevolvingCreditBalance
##  Min.   : 0.0000          Min.   : 0.000            Min.   :      0       
##  1st Qu.: 0.0000          1st Qu.: 0.000            1st Qu.:   3121       
##  Median : 0.0000          Median : 0.000            Median :   8549       
##  Mean   : 0.3126          Mean   : 0.015            Mean   :  17599       
##  3rd Qu.: 0.0000          3rd Qu.: 0.000            3rd Qu.:  19521       
##  Max.   :38.0000          Max.   :20.000            Max.   :1435667       
##  NA's   :697              NA's   :7604              NA's   :7604          
##  BankcardUtilization AvailableBankcardCredit  TotalTrades    
##  Min.   :0.000       Min.   :     0          Min.   :  0.00  
##  1st Qu.:0.310       1st Qu.:   880          1st Qu.: 15.00  
##  Median :0.600       Median :  4100          Median : 22.00  
##  Mean   :0.561       Mean   : 11210          Mean   : 23.23  
##  3rd Qu.:0.840       3rd Qu.: 13180          3rd Qu.: 30.00  
##  Max.   :5.950       Max.   :646285          Max.   :126.00  
##  NA's   :7604        NA's   :7544            NA's   :7544    
##  TradesNeverDelinquent..percentage. TradesOpenedLast6Months
##  Min.   :0.000                      Min.   : 0.000         
##  1st Qu.:0.820                      1st Qu.: 0.000         
##  Median :0.940                      Median : 0.000         
##  Mean   :0.886                      Mean   : 0.802         
##  3rd Qu.:1.000                      3rd Qu.: 1.000         
##  Max.   :1.000                      Max.   :20.000         
##  NA's   :7544                       NA's   :7544           
##  DebtToIncomeRatio         IncomeRange    IncomeVerifiable
##  Min.   : 0.000    $25,000-49,999:32192   False:  8669    
##  1st Qu.: 0.140    $50,000-74,999:31050   True :105268    
##  Median : 0.220    $100,000+     :17337                   
##  Mean   : 0.276    $75,000-99,999:16916                   
##  3rd Qu.: 0.320    Not displayed : 7741                   
##  Max.   :10.010    $1-24,999     : 7274                   
##  NA's   :8554      (Other)       : 1427                   
##  StatedMonthlyIncome                    LoanKey       TotalProsperLoans
##  Min.   :      0     CB1B37030986463208432A1:     6   Min.   :0.00     
##  1st Qu.:   3200     2DEE3698211017519D7333F:     4   1st Qu.:1.00     
##  Median :   4667     9F4B37043517554537C364C:     4   Median :1.00     
##  Mean   :   5608     D895370150591392337ED6D:     4   Mean   :1.42     
##  3rd Qu.:   6825     E6FB37073953690388BC56D:     4   3rd Qu.:2.00     
##  Max.   :1750003     0D8F37036734373301ED419:     3   Max.   :8.00     
##                      (Other)                :113912   NA's   :91852    
##  TotalProsperPaymentsBilled OnTimeProsperPayments
##  Min.   :  0.00             Min.   :  0.00       
##  1st Qu.:  9.00             1st Qu.:  9.00       
##  Median : 16.00             Median : 15.00       
##  Mean   : 22.93             Mean   : 22.27       
##  3rd Qu.: 33.00             3rd Qu.: 32.00       
##  Max.   :141.00             Max.   :141.00       
##  NA's   :91852              NA's   :91852        
##  ProsperPaymentsLessThanOneMonthLate ProsperPaymentsOneMonthPlusLate
##  Min.   : 0.00                       Min.   : 0.00                  
##  1st Qu.: 0.00                       1st Qu.: 0.00                  
##  Median : 0.00                       Median : 0.00                  
##  Mean   : 0.61                       Mean   : 0.05                  
##  3rd Qu.: 0.00                       3rd Qu.: 0.00                  
##  Max.   :42.00                       Max.   :21.00                  
##  NA's   :91852                       NA's   :91852                  
##  ProsperPrincipalBorrowed ProsperPrincipalOutstanding
##  Min.   :    0            Min.   :    0              
##  1st Qu.: 3500            1st Qu.:    0              
##  Median : 6000            Median : 1627              
##  Mean   : 8472            Mean   : 2930              
##  3rd Qu.:11000            3rd Qu.: 4127              
##  Max.   :72499            Max.   :23451              
##  NA's   :91852            NA's   :91852              
##  ScorexChangeAtTimeOfListing LoanCurrentDaysDelinquent
##  Min.   :-209.00             Min.   :   0.0           
##  1st Qu.: -35.00             1st Qu.:   0.0           
##  Median :  -3.00             Median :   0.0           
##  Mean   :  -3.22             Mean   : 152.8           
##  3rd Qu.:  25.00             3rd Qu.:   0.0           
##  Max.   : 286.00             Max.   :2704.0           
##  NA's   :95009                                        
##  LoanFirstDefaultedCycleNumber LoanMonthsSinceOrigination   LoanNumber    
##  Min.   : 0.00                 Min.   :  0.0              Min.   :     1  
##  1st Qu.: 9.00                 1st Qu.:  6.0              1st Qu.: 37332  
##  Median :14.00                 Median : 21.0              Median : 68599  
##  Mean   :16.27                 Mean   : 31.9              Mean   : 69444  
##  3rd Qu.:22.00                 3rd Qu.: 65.0              3rd Qu.:101901  
##  Max.   :44.00                 Max.   :100.0              Max.   :136486  
##  NA's   :96985                                                            
##  LoanOriginalAmount          LoanOriginationDate LoanOriginationQuarter
##  Min.   : 1000      2014-01-22 00:00:00:   491   Q4 2013:14450         
##  1st Qu.: 4000      2013-11-13 00:00:00:   490   Q1 2014:12172         
##  Median : 6500      2014-02-19 00:00:00:   439   Q3 2013: 9180         
##  Mean   : 8337      2013-10-16 00:00:00:   434   Q2 2013: 7099         
##  3rd Qu.:12000      2014-01-28 00:00:00:   339   Q3 2012: 5632         
##  Max.   :35000      2013-09-24 00:00:00:   316   Q2 2012: 5061         
##                     (Other)            :111428   (Other):60343         
##                    MemberKey      MonthlyLoanPayment LP_CustomerPayments
##  63CA34120866140639431C9:     9   Min.   :   0.0     Min.   :   -2.35   
##  16083364744933457E57FB9:     8   1st Qu.: 131.6     1st Qu.: 1005.76   
##  3A2F3380477699707C81385:     8   Median : 217.7     Median : 2583.83   
##  4D9C3403302047712AD0CDD:     8   Mean   : 272.5     Mean   : 4183.08   
##  739C338135235294782AE75:     8   3rd Qu.: 371.6     3rd Qu.: 5548.40   
##  7E1733653050264822FAA3D:     8   Max.   :2251.5     Max.   :40702.39   
##  (Other)                :113888                                         
##  LP_CustomerPrincipalPayments LP_InterestandFees LP_ServiceFees   
##  Min.   :    0.0              Min.   :   -2.35   Min.   :-664.87  
##  1st Qu.:  500.9              1st Qu.:  274.87   1st Qu.: -73.18  
##  Median : 1587.5              Median :  700.84   Median : -34.44  
##  Mean   : 3105.5              Mean   : 1077.54   Mean   : -54.73  
##  3rd Qu.: 4000.0              3rd Qu.: 1458.54   3rd Qu.: -13.92  
##  Max.   :35000.0              Max.   :15617.03   Max.   :  32.06  
##                                                                   
##  LP_CollectionFees  LP_GrossPrincipalLoss LP_NetPrincipalLoss
##  Min.   :-9274.75   Min.   :  -94.2       Min.   : -954.5    
##  1st Qu.:    0.00   1st Qu.:    0.0       1st Qu.:    0.0    
##  Median :    0.00   Median :    0.0       Median :    0.0    
##  Mean   :  -14.24   Mean   :  700.4       Mean   :  681.4    
##  3rd Qu.:    0.00   3rd Qu.:    0.0       3rd Qu.:    0.0    
##  Max.   :    0.00   Max.   :25000.0       Max.   :25000.0    
##                                                              
##  LP_NonPrincipalRecoverypayments PercentFunded    Recommendations   
##  Min.   :    0.00                Min.   :0.7000   Min.   : 0.00000  
##  1st Qu.:    0.00                1st Qu.:1.0000   1st Qu.: 0.00000  
##  Median :    0.00                Median :1.0000   Median : 0.00000  
##  Mean   :   25.14                Mean   :0.9986   Mean   : 0.04803  
##  3rd Qu.:    0.00                3rd Qu.:1.0000   3rd Qu.: 0.00000  
##  Max.   :21117.90                Max.   :1.0125   Max.   :39.00000  
##                                                                     
##  InvestmentFromFriendsCount InvestmentFromFriendsAmount   Investors      
##  Min.   : 0.00000           Min.   :    0.00            Min.   :   1.00  
##  1st Qu.: 0.00000           1st Qu.:    0.00            1st Qu.:   2.00  
##  Median : 0.00000           Median :    0.00            Median :  44.00  
##  Mean   : 0.02346           Mean   :   16.55            Mean   :  80.48  
##  3rd Qu.: 0.00000           3rd Qu.:    0.00            3rd Qu.: 115.00  
##  Max.   :33.00000           Max.   :25000.00            Max.   :1189.00  
## 

Univariate Plots Section:

##                ListingKey ListingNumber           ListingCreationDate
## 1 1021339766868145413AB3B        193129 2007-08-26 19:09:29.263000000
## 2 10273602499503308B223C1       1209647 2014-02-27 08:28:07.900000000
## 3 0EE9337825851032864889A         81716 2007-01-05 15:00:47.090000000
## 4 0EF5356002482715299901A        658116 2012-10-22 11:02:35.010000000
## 5 0F023589499656230C5E3E2        909464 2013-09-14 18:38:39.097000000
##   CreditGrade Term LoanStatus          ClosedDate BorrowerAPR BorrowerRate
## 1           C   36  Completed 2009-08-14 00:00:00     0.16516       0.1580
## 2               36    Current                         0.12016       0.0920
## 3          HR   36  Completed 2009-12-17 00:00:00     0.28269       0.2750
## 4               36    Current                         0.12528       0.0974
## 5               36    Current                         0.24614       0.2085
##   LenderYield EstimatedEffectiveYield EstimatedLoss EstimatedReturn
## 1      0.1380                      NA            NA              NA
## 2      0.0820                 0.07960        0.0249         0.05470
## 3      0.2400                      NA            NA              NA
## 4      0.0874                 0.08490        0.0249         0.06000
## 5      0.1985                 0.18316        0.0925         0.09066
##   ProsperRating..numeric. ProsperRating..Alpha. ProsperScore
## 1                      NA                                 NA
## 2                       6                     A            7
## 3                      NA                                 NA
## 4                       6                     A            9
## 5                       3                     D            4
##   ListingCategory..numeric. BorrowerState    Occupation EmploymentStatus
## 1                         0            CO         Other    Self-employed
## 2                         2            CO  Professional         Employed
## 3                         0            GA         Other    Not available
## 4                        16            GA Skilled Labor         Employed
## 5                         2            MN     Executive         Employed
##   EmploymentStatusDuration IsBorrowerHomeowner CurrentlyInGroup
## 1                        2                True             True
## 2                       44               False            False
## 3                       NA               False             True
## 4                      113                True            False
## 5                       44                True            False
##                  GroupKey              DateCreditPulled
## 1                         2007-08-26 18:41:46.780000000
## 2                                   2014-02-27 08:28:14
## 3 783C3371218786870A73D20 2007-01-02 14:09:10.060000000
## 4                                   2012-10-22 11:02:32
## 5                                   2013-09-14 18:38:44
##   CreditScoreRangeLower CreditScoreRangeUpper FirstRecordedCreditLine
## 1                   640                   659     2001-10-11 00:00:00
## 2                   680                   699     1996-03-18 00:00:00
## 3                   480                   499     2002-07-27 00:00:00
## 4                   800                   819     1983-02-28 00:00:00
## 5                   680                   699     2004-02-20 00:00:00
##   CurrentCreditLines OpenCreditLines TotalCreditLinespast7years
## 1                  5               4                         12
## 2                 14              14                         29
## 3                 NA              NA                          3
## 4                  5               5                         29
## 5                 19              19                         49
##   OpenRevolvingAccounts OpenRevolvingMonthlyPayment InquiriesLast6Months
## 1                     1                          24                    3
## 2                    13                         389                    3
## 3                     0                           0                    0
## 4                     7                         115                    0
## 5                     6                         220                    1
##   TotalInquiries CurrentDelinquencies AmountDelinquent
## 1              3                    2              472
## 2              5                    0                0
## 3              1                    1               NA
## 4              1                    4            10056
## 5              9                    0                0
##   DelinquenciesLast7Years PublicRecordsLast10Years
## 1                       4                        0
## 2                       0                        1
## 3                       0                        0
## 4                      14                        0
## 5                       0                        0
##   PublicRecordsLast12Months RevolvingCreditBalance BankcardUtilization
## 1                         0                      0                0.00
## 2                         0                   3989                0.21
## 3                        NA                     NA                  NA
## 4                         0                   1444                0.04
## 5                         0                   6193                0.81
##   AvailableBankcardCredit TotalTrades TradesNeverDelinquent..percentage.
## 1                    1500          11                               0.81
## 2                   10266          29                               1.00
## 3                      NA          NA                                 NA
## 4                   30754          26                               0.76
## 5                     695          39                               0.95
##   TradesOpenedLast6Months DebtToIncomeRatio    IncomeRange
## 1                       0              0.17 $25,000-49,999
## 2                       2              0.18 $50,000-74,999
## 3                      NA              0.06  Not displayed
## 4                       0              0.15 $25,000-49,999
## 5                       2              0.26      $100,000+
##   IncomeVerifiable StatedMonthlyIncome                 LoanKey
## 1             True            3083.333 E33A3400205839220442E84
## 2             True            6125.000 9E3B37071505919926B1D82
## 3             True            2083.333 6954337960046817851BCB2
## 4             True            2875.000 A0393664465886295619C51
## 5             True            9583.333 A180369302188889200689E
##   TotalProsperLoans TotalProsperPaymentsBilled OnTimeProsperPayments
## 1                NA                         NA                    NA
## 2                NA                         NA                    NA
## 3                NA                         NA                    NA
## 4                NA                         NA                    NA
## 5                 1                         11                    11
##   ProsperPaymentsLessThanOneMonthLate ProsperPaymentsOneMonthPlusLate
## 1                                  NA                              NA
## 2                                  NA                              NA
## 3                                  NA                              NA
## 4                                  NA                              NA
## 5                                   0                               0
##   ProsperPrincipalBorrowed ProsperPrincipalOutstanding
## 1                       NA                          NA
## 2                       NA                          NA
## 3                       NA                          NA
## 4                       NA                          NA
## 5                    11000                      9947.9
##   ScorexChangeAtTimeOfListing LoanCurrentDaysDelinquent
## 1                          NA                         0
## 2                          NA                         0
## 3                          NA                         0
## 4                          NA                         0
## 5                          NA                         0
##   LoanFirstDefaultedCycleNumber LoanMonthsSinceOrigination LoanNumber
## 1                            NA                         78      19141
## 2                            NA                          0     134815
## 3                            NA                         86       6466
## 4                            NA                         16      77296
## 5                            NA                          6     102670
##   LoanOriginalAmount LoanOriginationDate LoanOriginationQuarter
## 1               9425 2007-09-12 00:00:00                Q3 2007
## 2              10000 2014-03-03 00:00:00                Q1 2014
## 3               3001 2007-01-17 00:00:00                Q1 2007
## 4              10000 2012-11-01 00:00:00                Q4 2012
## 5              15000 2013-09-20 00:00:00                Q3 2013
##                 MemberKey MonthlyLoanPayment LP_CustomerPayments
## 1 1F3E3376408759268057EDA             330.43            11396.14
## 2 1D13370546739025387B2F4             318.93                0.00
## 3 5F7033715035555618FA612             123.32             4186.63
## 4 9ADE356069835475068C6D2             321.45             5143.20
## 5 36CE356043264555721F06C             563.97             2819.85
##   LP_CustomerPrincipalPayments LP_InterestandFees LP_ServiceFees
## 1                      9425.00            1971.14        -133.18
## 2                         0.00               0.00           0.00
## 3                      3001.00            1185.63         -24.20
## 4                      4091.09            1052.11        -108.01
## 5                      1563.22            1256.63         -60.27
##   LP_CollectionFees LP_GrossPrincipalLoss LP_NetPrincipalLoss
## 1                 0                     0                   0
## 2                 0                     0                   0
## 3                 0                     0                   0
## 4                 0                     0                   0
## 5                 0                     0                   0
##   LP_NonPrincipalRecoverypayments PercentFunded Recommendations
## 1                               0             1               0
## 2                               0             1               0
## 3                               0             1               0
## 4                               0             1               0
## 5                               0             1               0
##   InvestmentFromFriendsCount InvestmentFromFriendsAmount Investors
## 1                          0                           0       258
## 2                          0                           0         1
## 3                          0                           0        41
## 4                          0                           0       158
## 5                          0                           0        20

In the next sections, I will explore some variables of the data set, as described bellow:

* Term: The length of the loan expressed in months;
* LoanStatus: The current status of the loan, as Cancelled, Defaulted, etc;
* ClosedDate: Closed date, when applicable;
* BorrowerRate: The Borrower's interest rate for the loan;
* ProsperRating (numeric): The Risk Rating assigned at the time the listing was created. Varies from 0 (worst) to 7 (best);
* ListingCategory: The category of the listing that the borrower selected when posting their listing;
* IsBorrowerHomeowner: If the Borrower is classified as a homeowner by the Prosper Criteria;
* DebtToIncomeRatio: The debt to income ratio of the borrower at the time the credit profile was pulle. This value is capped at 10.01;
* StatedMonthlyIncome: The monthly income the borrower stated at the time the listing was created;
* LoanOriginalAmount: The origination amount of the loan;
* LoanOriginationDate: The date the loan was originated.

Before we begin the Analysis, Let’s see the Distribution of the Variables we interested in. The Distribution of Loan Amouts:

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1000    4000    6500    8337   12000   35000

Mainly the Prosper Loans are are under 10000 Dollars amount. Most loan amount people are taking is about $4000

The Distribution of DebtToIncomeRatio:

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   0.000   0.140   0.220   0.276   0.320  10.010    8554

As we see that most ratio of Dept to the income are under 2.0.We can state that people dept less than 20% of their income. It’s also obvious that there’s outlier ratio vlaue = 10.0 which I belive it’s a valid data point.no need to remove it.

The Distribution Borrower Income Range:

IncomeRange are normally distribution, and most borrowers incomes range between 25000-49999. There are few borrowers are unemployed.

The Distribution Loan Investors:

The number Of Loans in each Credit Grade .

## 
##           A    AA     B     C     D     E    HR    NC 
## 84984  3315  3509  4389  5649  5153  3289  3508   141

Numbers of loans with Grade [C] takes the First place among other Grades.

What Were The Reasons For The Loans ? I’m going to analyze the Category of the Loans to see the Category many People are taking loans for.

## 
##     0     1     2     3     4     5     6     7     8     9    10    11 
## 16965 58308  7433  7189  2395   756  2572 10494   199    85    91   217 
##    12    13    14    15    16    17    18    19    20 
##    59  1996   876  1522   304    52   885   768   771

The most number of loans came from Debt consolidation loans, then home improvement and bussines loans in the second place.

q2) what are the borrower sates ? In the blew I’m going to explore the BorrowerState vairable ,and see the number of loans in each state.

We can see that most loan borrowers are heavily concentrated in CA stat.Then TX and NY in the scond place

How many completed Prosper loans ?

As shown above its seems that number of completed loans good enough,it’s about 30000 loans.Moreover,The Most number of Loans are unser ‘Current’ Status.

  1. What is the employment status of people taking loans ?

As shown above most borrower are Employed in general ,and there are many that are currently with a full time job.

  1. What is the number of loans in each Prosper Rating? and the amout of loans in each ProsperRating ?

  1. Is the rating affect the amount of loan people take ?In other words, borrower with high rating takes higher or lower loan amount? here we exploring the amout of loans for the prosper rating.

Higher amount are Borrowers with A/B/C Ratings.

There’s no noticable difference betweet the average homeowner Borrowers and nonhomeowner Borrowers.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     0.0   131.6   217.7   272.5   371.6  2251.5

Most people pay between $100-200 each month for their loans.This seems a good range for the employment people.

It looks like most borrower are under (Professional) in general.

Univariate Analysis

What is the structure of your dataset?

we’re exploring Prosper’s peer-to-peer lending data collected. Prosper is the first peer-to-peer lending marketplace, and fortunately for us, they maintain a full public database of all loans where we can analyze the performance of any subset of loans.

This data set contains 113,937 loans with 81 variables on each loan, including:

loan amount
borrower rate (or interest rate)
current loan status
borrower income
borrower employment status
the latest payment information among others.

What is/are the main feature(s) of interest in your dataset?

There are a number of features that will give color to the profile of the loan and borrower. I think the main ones for borrowers are borrower employment status,ishomewoner,income range, occupation, debt to income ratio, income and the ones for the loans are loan amount, interest rate, and term.

What other features in the dataset do you think will help support your
investigation into your feature(s) of interest?

Loan category and status can determind the indention of the loan

Did you create any new variables from existing variables in the dataset?

No,There’re many Vairables in the dataset,and also they are informative vairables and can lead to a great analysis resulte.

Of the features you investigated, were there any unusual distributions?
Did you perform any operations on the data to tidy, adjust, or change the form
of the data? If so, why did you do this?

I didn’t make a change on the data,I keep it as it’s.yet I have invistgated some outliers listed above and decided not to remove them.it’s need domain knowledge and more invistgation of the source of the outliers.

In genral, almost borrower are employed. Suprisingly, there are some borrower without employment from Prosper.They are both homeowner Borrowers and nonhomeowner they both tend to have the same number of loans. The reasons for borrowers are Debt consolidation loans, then home improvement and bussines loans.The majority borrowers??? income are in range 25000-49999.

Bivariate Plots Section

I will start by seeing which variables are correlated in the data set.

In the above visualization, I analyzed the coloration of subset of the data which the vairables I’m interested in.There are lots of variables correlated. The most correlated variables (corr = 0.9) are loan origination amount and monthly loan payment.In the first place ,The coloration of BorrowerAPR and BorrowerRate are 1 that’s becouse those vairables indicate the same kind of rate ,and they have very strong positive correlation.Also,Both loan origination amount and monthly loan payment are correlated with Investors (corr=04), which is postive value that indicate both variables move in the same direction.Besides the postive correlated vairables,Prosper score is negatively correlated with Borrower rate and DebtToIncomeRatio.And Borrower rate is negatively correlated with Prosper score and credit score.

As showen above, borrower with AA/A/B/C rating takes higher loan amounts.We can say people with higher ProsperRating take higher loan amount.There’re outliers in each Ratings that may opposite of my conclusion. We concluded in univariate that the minimum loan amount is 1000 ,and most people loan amount is between 1000 and 20000.

  1. Is there any indication that borrower with higher income have higher Borrower APR?

As showen above,It seems that the income range have no affect to the borrower APR .Moreover,there’s number of outlier in ‘100,000+’ income range and ‘not displayed’ Borrower APR is mostly within 0.1 and 0.3 %.

There are negitve linear relation between Borrower APR and Monthly Income , We can see that the points above become less when Borrower APR are above (0.3).Moreover, The coloration is (-2) as showen in the correlation matrix visualization above.

As showen above when the number of Investors increas the loan amount is increasing,this means that the loan by many investers tend to become higher loan amount.Thus,we can say that there’s no high loan amount with small number of investers.

  1. Is homeowner are likly to pay the loan ?

There is no clear evidence that homeowners are having higher BorrowerRate loans.

## Q1 2006 Q1 2007 Q1 2008 Q1 2010 Q1 2011 Q1 2012 Q1 2013 Q1 2014 Q2 2006 
##     315    3079    3074    1243    1744    4435    3616   12172    1254 
## Q2 2007 Q2 2008 Q2 2009 Q2 2010 Q2 2011 Q2 2012 Q2 2013 Q3 2006 Q3 2007 
##    3118    4344      13    1539    2478    5061    7099    1934    2671 
## Q3 2008 Q3 2009 Q3 2010 Q3 2011 Q3 2012 Q3 2013 Q4 2005 Q4 2006 Q4 2007 
##    3602     585    1270    3093    5632    9180      22    2403    2592 
## Q4 2008 Q4 2009 Q4 2010 Q4 2011 Q4 2012 Q4 2013 
##     532    1449    1600    3913    4425   14450

Seeing the loan amount over the years from 2006 to 2014 and in each quarter of the year.It appears that the amount of loans is increasing over the years and in almost many cases the amount increas over the quarter.

The above chart emphasize one of the previous charts that the most loan monthly payment around $100-200 even when we split by LoanStatus.

It seems that the employed borrowers and full time borrowers have highest monthly payment where the unemployed and part-time have the lowest.people with regulare income and salary tend to pay more monthly.

Administrative and homemaker seems to have highest median return among other Borrower.

ggplot(aes(x=ProsperRating),data = pld) +
geom_jitter(aes(colour = ListingCategory),stat="count")+
ggtitle('Number of Loans in each Category') +
xlab('Listing Category') +
ylab('Number of Loans')

There are a lot of loans that were taken for Debt Consolidation on all Risk levels.

Bivariate Analysis

Talk about some of the relationships you observed in this part of the
investigation. How did the feature(s) of interest vary with other features in
the dataset?

In my analysis in this section I have chooseden to see the relationships between two vairables. The original laon amount and ProsperRating has clear relationships when one of them is has high value the other one has high value as well.Also the Loan original amount is colorated with monthly payment and people pay more monthly when the loan amount is bigger. I have noticed that when the loan amount is larg this mean many investers are invested in this loan.

Did you observe any interesting relationships between the other features
When seeing the monthly payment amount ,Employed people tend to pay more monthly among others.

Prosper loan amount is growing over the years which indecate it’s growing successful business

What was the strongest relationship you found?

The strongest relationship I found was between Loan Original Amount and Monthly payment variable .Also ProsperRating and Monthly payment has strong postive relationship.

Multivariate Plots Section

As you may know, prosper score is the custom risk 1-10 where 1 risk score.It’s appear that when it’s high risk score (best value is 10) the yielders gain less rate.Also We can see (Employed & Self-employed) are the most borrower.

Mainly we can see the DebtToIncomeRatio of the employed borrower is dereasing when score gets higher after prosperscore=3.

The amount of the loan is increasing from 2006 to 2013 , The color indecating the borrowers with (100000+) income take the highes amount in each quarter of a year.

It’s shows strong relation between original amount of the loan and the borrower monthly loan payment. The the length of loan Term have an affect to the monthly payment.

Multivariate Analysis

Talk about some of the relationships you observed in this part of the
investigation. Were there features that strengthened each other in terms of
looking at your feature(s) of interest?

I looked at loan amounts vs income ranges and saw that the higher the income, the larger the loans amounts on average over the years and each quarter of the year.

Were there any interesting or surprising interactions between features?

The relationship between Prosper score and Lender Yield has an inversed relationship. The higher the score, the lower lender yield for all borrowers levels( employed, self-employed, etc). The DebtToIncomeRatio of the employed borrower is dereasing when score gets higher after.

OPTIONAL: Did you create any models with your dataset? Discuss the
strengths and limitations of your model.

Final Plots and Summary

Plot One

Description One

The types of browers are Employed,full-time,not available, not employed,other ,retired ,and self-employed. What I observe is the employed borrowers and full time borrowers have highest monthly payment and then self-employement came and it has good monthly payment amount.Yet,unemployed and part-time have the lowest I found this information is good for the Prosper platform so they arrange the monthly payment for Employment and unemployemnt separately.

Plot Two

Description Two

Emphasizing on the strong relationship between Laon original amount and monthly payment , the monthly payment is depending on the loan original amount factor and when it’s huge amount the laon monthly amount is means the loan amount is big.Also The length of the loans affected by the loan original amount.36-month and 60-month length of the loans that is high in amount 12-month for less amount. It’s interesting to see the amount of the loan affecting two facters in the Prosoer platform , one is the Term of the length of the loan and the other is the monthly payment.Also such an info is important to know the behiviare of the big and small loans amounts in their system.

Plot Three

Description Three

In The above visualization there are three factors I’m investing the relationship among them.seeing The affect of income range on the amount and weather this afect is over the year and each quarter of the year. first information we all catched is the amount of loan is the highest in 2013 and 14 . each Income range reperesent a light green color in the amount bar which takes the highest space in the bar. we concluded that the amount is increasing over the year and quarter and people with income (100,000+) reperest the high value of the loan amount. Represent this information can guide the Prosper platform how they’re doing over the years and which borrower income range help them in reaching the amount of such a loan.

Reflection

Summary

I was pleased with how easy and fun to Analyze Loans dataset in R.The plotting and development are straight forward,yet I spent a some of time reading R documentation for the advanced plots specially in multivariate analysis.Also Since I have zero domain knowlege about this area and loans specially, It took me a lot of time just to understand the meaning of all the variables in the list and figure out what vairables I want to analyze.Nevertheless,Important knowlege about the loans and many features around it has been gained.I was able to determine the borrowers profile and it’s relationship with the loans.

As I progress through the analysis,I started by getting idea of the loans and borrowers profile.seaing the disturbution of the data at first help me get the full picture.Then moving to the bivariate analysis where I started this by producing the coloration among selected features.and then ploting the features using scatter plot for most of the cases.Finally, I decided to add more feature to bivariate charts to start the Multivariate analysis.It’s simple method yet very imprtant to get great insights.

Limitation and Future Improvment

limitation of the analysis besides the lack of domain knowlege that may mislead of which vairables to invistigate.I couldn’t decided to remove the outliers from the datafram , I belive deciding weather or not removing the outlier need more knowlege and time.In addition, There are some cases where using advanced visualization packages are prefered this would be considered for further Improvment.

For Further enhancement in the analysis, I would like to build a regression model using the variables to predict the Prosper score , BorrowerAPR and estimate the completion of the loans time and the different from the original amount to the benifits.For relationships among variables I would like to conduct statistical test to produce the final statment.